Abstract. We introduce a simple stochastic dynamics for game theory. It 
assumes "local" rationality in the sense that any player climbs the gradient 
of his utility function in the presence of a stochastic force which represents 
deviation from rationality in the form of a "heat bath". We focus on par- 
ticular games of a large number of players with a global interaction which 
is typical of economic systems. The stable states of this dynamics coincide 
with the Nash equilibria of game theory. We study the gaussian fluctuations 
around these equilibria and establish that fluctuations around competitive 
equilibria increase with the number of players. In other words, competi- 
tive equilibria are characterized by very broad and uneven distributions 
among players. We also develop a small noise expansion which allows us to 
compute a "free energy" functional. In particular, we discuss the problem 
of equilibrium selection when more than one equilibrium state is present. 



1. Introduction 

The economic world is a complex many-body dynamical system whose fluctuation 
phenomena have recently attracted much attention in the physics 
community [1, 2, 3, 4, 5]. The identification, in economics and finance, of 
phenomena (such as scaling and multiscaling) which also occur in physical 
systems (such as critical phase transitions and turbulence) suggests that 
some of the knowledge and techniques developed in physics to understand 
fluctuation phenomena, might also be useful to understand fluctuations in 
economics and finance. 

Fluctuation phenomena in physics depend on the nature of the equilib- 
rium state. The starting point to understand fluctuations in economics is 
then a theory for economic equilibria. Game Theory [6] is the natural candidate: 
it describes the interaction among players' strategies and, assuming 
rationality, it identifies the possible game equilibria, named after Nash [7]. 

It is important to stress at this point that we shall deal mainly with 
economics and not with finance. Finance is, loosely speaking, a dynamical 
system out of equilibrium, where players gamble (speculate) on future market 
fluctuations (see ref. [5] for a model). In an economic system, instead, 
the assumption of rational behavior is more realistic. However, perfect rationality 
is utopian in real life. Deviations from rationality, which can arise 
from human errors, from limited or incomplete information or from random 
events, are practically unavoidable. Put differently, one can say that, 
in real life, reducing human errors or the impact of random events costs 
time and money. Infinite precision is impossible except at infinite cost. The 
analysis of the effects of "irrationality" then becomes an important issue for 
understanding how game theoretical predictions can be modified in realistic 
situations. 

We shall address this issue for a "thermodynamic" economic system: 
a system with many degrees of freedom (players). The random events discussed 
above then have the same qualitative nature as thermal fluctuations 
in statistical mechanics. One can indeed think that, in a thermodynamic 
system, each degree of freedom pursues the minimization of the (global) 
energy in the presence of random shocks due to thermal fluctuations. In 
the same manner we shall assume that in a macro-economic system, each 
agent pursues the maximization of his utility, under the effect of random 
shocks. In this perspective, Nash equilibria become analogs of ground states 
in statistical mechanics and deviations from rationality can be introduced 
in exactly the same way as temperature is introduced in statistical me- 
chanics. In particular, there are several dynamical ways, depending on the 
nature of the problem, to model temperature in statistical mechanics. This 
leads us to a natural definition of stochastic dynamics in game theory. 

In this work, which is an extension of a previous paper [8], we shall 
follow these lines, using the Langevin approach. We shall address the issue 
of fluctuations around Nash equilibria and the effects of deviations from 
rationality in some specific games with a large number n of players. We 
shall focus on games where each of the n players can control a continuous 
variable or "strategy" $x_i$. He is endowed with a utility function $u_i$ which 
depends on his strategy $x_i$ as well as on those of the other players $\{x_j, j \neq i\}$. 
In the model, each player continuously adjusts his variable $x_i$ in order to 
maximize his utility. He also faces stochastic shocks which affect his actions 
and, as a result, the variable $x_i(t)$ becomes a continuous time stochastic 
process. The stochastic force acting on a player is similar to that arising 
from a "heat bath" at finite temperature in statistical mechanics. The 
dynamics allows for a comparison with equilibrium dynamics in statistical 
mechanics, which we find a useful paradigm to discuss the results. In this 
comparison negative utility plays the role of an energy and the effects of 
fluctuations can be compared to entropic effects in statistical mechanics. 
The main difference between the two dynamics is that, while in statistical 
mechanics each degree of freedom evolves to minimize a global (free) energy, 
in game theory each degree of freedom maximizes his own (part of the total) 
utility. 

Figure 1. Global interaction among players in a stock market. 

We shall focus on a particular class of games with a global interaction. 
By this we mean that the utility $u_i$ of each player depends on the other players' 
strategies $x_j$ only through an aggregate or global quantity $\bar{x}$, whose value is 
determined by all of them. This peculiar interaction, shown schematically 
in figure 1, reflects the nature of economic laws such as the law of demand 
and supply. 

Nash equilibria are stationary points of the dynamics. When we include 
fluctuations we find that two main kinds of equilibria exist: i) non-competitive 
equilibria, where each player's equilibrium strategy is determined by his 
interaction with the rules of the game and ii) competitive equilibria, which result 
from the aggressive competition of each player with all the others. For 
example, we shall discuss a game where the introduction of taxes determines 
both a non-competitive and a competitive equilibrium. In the former, it is 
the balance between profit and loss due to taxes which is important for 
each player. In the latter, competitive equilibrium, taxes are negligible and 
the balance leading to equilibrium is only due to the competition among 
all players. 

The main results that we shall find are: 

1. Competitive Nash equilibria are usually affected by very large fluctua- 
tions which increase with the number of players. Competition leads to 
broad distributions and large inequalities in an economic system. This 
is reminiscent of Pareto Law of distribution of incomes [9]. We shall 
find that inequalities increase with the number of players. 

2. Competitive equilibria are also characterized by large relaxation times 
which are proportional to n, and by a negative correlation between 
players' strategies. This means that two players tend to have opposite 
fluctuations around the Nash equilibrium. 

3. At odds with statistical mechanics, where fluctuations always increase 
the system's energy, we shall find that in a game theoretical system, 
under particular conditions, fluctuations increase the utility (i.e. decrease 
the "energy"). 

4. We can, in principle, compute the stationary state distribution in strategy 
space. This allows one to solve, for example, the problem of equilibrium 
selection in games where more than one Nash equilibrium exists. 

5. Fluctuations, in general, displace Nash equilibria and in some cases, 
for strong enough randomness, a Nash equilibrium can also disappear. 

6. Our approach also suggests that time-scales for the transition from 
one Nash equilibrium to another one are proportional to $\exp[-\Delta F/D]$, 
where $\Delta F$ is a "free energy barrier" and D is the noise strength. 

The paper is organized as follows. The next section reviews game theory 
and discusses some simple games. We also briefly discuss evolutionary 
game theory and its differences from our approach. Section 4 introduces the 
class of models we shall analyze. The following section discusses gaussian 
fluctuations around Nash equilibria. Then we develop a general approach 
to the stationary state probability distribution in strategy space. The main 
results are illustrated with simple examples. In the final section we summa- 
rize the results, we draw some conclusions and comment on possible further 
extensions. 

2. Game theory and evolution 

An economic system consists of a large number of interacting agents. In the 
game theoretical setting, each agent has a spectrum of strategies parametrized 
by an index x. Each player i is also endowed with a utility or payoff function 
$u_i$ which depends on the strategy $x_i$ he plays as well as on those played 
by all other players. With the notation $x_{-i} = \{x_j, j \neq i\}$, we can conveniently 
write $u_i = u_i(x_i, x_{-i})$. A strategy $x_i$ is also called a pure strategy, as opposed 
to mixed strategies $\mu_i(x)$, in which strategy x is played with probability 
$\mu_i(x)$ by player i. Under mixed strategies, the strategies used by the players 
are independent random variables. Independence is justified in one stage 
games, which are played just once and in which each player has to decide his 
strategy simultaneously, without information about what the others will do. 

Game theory, in its simplest setting, assumes that the payoff functions 
are common knowledge and that each player behaves rationally. Rationality 
is also common knowledge, which means that each player knows all other 
players are rational (these are so-called complete information games). Game 
theory aims at predicting the possible stable states of the system, which are 
called Nash Equilibria [7]. The strategies $x_i^*$ are a Nash Equilibrium (NE) 
if each player's utility, for fixed opponents' strategies $x^*_{-i}$, is a maximum 
for $x_i = x_i^*$, i.e. $u_i(x_i^*, x^*_{-i}) \geq u_i(x_i, x^*_{-i})$, $\forall x_i$. The NE strategies $x_i^*$ are 
such that no player has incentives to change his strategy, since any change 
would cause a utility loss. Nash showed [7] that any game has at least one 
Nash equilibrium in the space of mixed strategies. 



2.1. THE COURNOT GAME AND THE TRAGEDY OF THE COMMONS 

The concept of NE is best illustrated by a simple example, originally 
introduced by Cournot in 1838 [10]: 2 firms produce quantities $x_1$ and 
$x_2$ respectively, of a homogeneous product. The market-clearing price of 
the product depends, through the law of demand and supply, on the total 
quantity $X = x_1 + x_2$ produced: $P(X) = a - bX$. The larger X, the 
smaller P is. The model assumes that the cost of producing a quantity 
$x_i$ is $c x_i$ and c < a. The firms choose their strategies (i.e. $x_i$) with the 
goal of maximizing their profit. We can then define the payoff function as 
$u_i = x_i [P(x_1 + x_2) - c] = x_i [a - c - b(x_1 + x_2)]$. The problem is to find 
$x_i^*$ assuming that both firms behave rationally. One way to do this is by 
means of the concept of best response. The best response of a player i to 
the opponents' strategies $x_{-i}$ is the strategy $x_i = x_i^*(x_{-i})$ which maximizes 
$u_i(x_i, x_{-i})$. In the Cournot game, the best response $x_1^*(x_2)$ of firm 1 to any 
given strategy $x_2$ of firm 2 is obtained by maximizing $u_1(x_1, x_2)$ with 
respect to $x_1$ at fixed $x_2$, i.e. $x_1^*(x_2) = (a - c - b x_2)/(2b)$. Firm 2, knowing 
that 1 behaves rationally (i.e. that it will play $x_1^*(x_2)$ whatever $x_2$ is), will 
choose the $x_2^*$ which maximizes $u_2(x_1^*(x_2), x_2)$. It is important to stress that 
operationally this means 

$$ \left. \frac{\partial}{\partial x_2} u_2(x_1, x_2) \right|_{x_1 = x_1^*(x_2)} = 0, $$

i.e. the maximum of $u_2$ must be found at fixed $x_1$. This leads to $x_1^* = 
x_2^* = (a - c)/(3b)$. This simple example shows that rationality, and the 
assumption of others' rationality, plays a crucial role in the concept of Nash 





MATTEO MARSILI AND YI-CHENG ZHANG 



equilibrium [7]. It is easy to generalize this game to n firms. Let us set, for 
convenience, a - c = 1 and b = 1/n. Then $u_i(x_i, x_{-i}) = x_i (1 - \bar{x})$, where 
$\bar{x} = (x_1 + \ldots + x_n)/n$ is the average of the $x_i$. The calculation generalizes 
straightforwardly and we find [8] a NE at $x_i = x_0 = n/(n+1)$ and a per 
player payoff $u_i = n/(n+1)^2$. This celebrated example is also known as the 
Tragedy of the commons [11]. In this formulation of the problem, the utility 
$u_i = x_i V(X)$ depends on a common resource $V(X) = P(X) - c$ which, at 
the NE, is almost totally depleted by the aggressive behavior of players. As 
a result each player receives a very small payoff. This is an example of a 
competitive NE where the strategies of players are not limited because of 
a direct loss in utility, but because the global resource is almost exhausted 
by the aggressive, competitive behavior of players. 
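
As an illustrative sketch (not part of the original analysis), the best-response argument can be checked numerically: iterating the two best-response maps of the duopoly converges to $(a - c)/(3b)$, and the n-firm values follow from the closed form quoted above. The function names and the parameter values are assumptions made for this example only.

```python
def best_response(x_other, a, b, c):
    """Best response of one Cournot firm to the rival's output x_other.

    It maximizes x * (a - c - b * (x + x_other)) over x >= 0.
    """
    return max(0.0, (a - c - b * x_other) / (2.0 * b))


def duopoly_equilibrium(a, b, c, iterations=200):
    """Iterate the best-response maps of the two firms from x1 = x2 = 0."""
    x1 = x2 = 0.0
    for _ in range(iterations):
        # simultaneous update: each firm answers the rival's previous output
        x1, x2 = best_response(x2, a, b, c), best_response(x1, a, b, c)
    return x1, x2


def symmetric_n_firm(n):
    """Closed-form NE and payoff for a - c = 1, b = 1/n (values from the text)."""
    x_star = n / (n + 1.0)
    payoff = x_star * (1.0 - x_star)  # u_i = x_i (1 - xbar) at the NE
    return x_star, payoff


if __name__ == "__main__":
    a, b, c = 10.0, 1.0, 4.0  # illustrative values with c < a
    print(duopoly_equilibrium(a, b, c), (a - c) / (3.0 * b))
    print(symmetric_n_firm(100))
```

The iteration is a contraction here (each step halves the distance to the fixed point), so a fixed number of iterations is enough; for other games best-response iteration need not converge.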

2.2. REPEATED GAMES AND EVOLUTIONARY GAME THEORY 

This setting generalizes to repeated games, where different stage games are 
played a finite or an infinite number of times. In repeated games a strategy 
must describe the action the player has to take at each stage. Also, the single 
stage utility is generally replaced by a utility which accounts for many 
stages, usually with a discount factor (i.e. an exponential weight function 
for future utility). All these features make the analysis much more complex 
than in single stage games. 

A game possesses, in general, several NE, and this raises the question of 
which of them is more relevant. In order to answer this question, several 
definitions of stability have been proposed [12, 13, 14]. The most successful 
approach to stability has been that of evolutionary game theory [15, 12]. 
This considers a game with mixed strategies as a game played by a population 
of players playing pure strategies with random opponents. 

This idea has attracted much interest in the community of theoretical 
population biologists, who have translated it into a mathematical 
model: the so-called replicator dynamics [16]. Though several versions of this 
dynamics have been proposed [16, 12], in its simplest form it assumes that 
the individuals playing a given strategy reproduce at a rate proportional to 
their utility. Stochastic fluctuations, in the population dynamical setting of 
replicator dynamics, have also been considered in ref. [17]. 

Some points are worth noting about evolutionary game 
theory. The first is that its application has been mostly limited to two-player 
games. Indeed its generalization to contexts with n players is technically 
very complex [12]. The second is that rationality is not assumed. Players are, 
on the contrary, rather dull: they just keep playing the game the way their 
ancestors did. Rational NE result from the selective evolution of replicator 
dynamics. Finally we note that replicator dynamics assumes that the 
strategies $x_i$ are independently chosen by each player (see footnote 1). It also 
assumes that $x_i(t)$ is independent of $x_i(t')$ for $t' \neq t$. 

In contrast with these observations, our goal is to study a simple realistic 
dynamics for a game with $n \gg 1$ players who are "almost" rational. We 
shall do this by weakening the assumption of perfect rationality with the 
introduction of a "thermal" noise. As a result, $x_i(t)$ becomes a continuous time 
stochastic process. We shall therefore pursue quite different purposes and 
use totally different techniques from those of evolutionary game dynamics. 

3. The Langevin approach 

Focusing on a class of models with a continuous spectrum of strategies $x_i$, 
we recently proposed [8] a Langevin dynamics of the form 

$$ \partial_t x_i = \Gamma_i \frac{\partial u_i}{\partial x_i} + \eta_i, \qquad (1) $$

where the stochastic term $\eta_i(t)$ models deviations from perfect rationality. 
We take $\eta_i(t)$ gaussian with $\langle \eta_i(t) \rangle = 0$ and 

$$ \langle \eta_i(t)\, \eta_j(t') \rangle = 2 \Delta_{i,j}\, \delta(t - t'). \qquad (2) $$

Eq. (1) is a model dynamics which contains both the deterministic efforts 
each player exerts to increase his payoff and the effects of random 
events. The deterministic part assumes "local rationality" of the agents: 
each agent knows the direction in which his utility increases. In 
other words, it assumes that each agent knows the utility function $u_i$ as a 
function of $x_i$ in a small neighborhood of its current value. Note that this 
weakens the assumption of perfect rationality, according to which each player 
knows the function $u_i$ for any $x_i$ and any $x_j$, $j = 1, \ldots, n$. 

The stochastic term $\eta_i$ represents all hindrances which prevent rational 
behavior. These may be internal, i.e. affect only one player (e.g. illness), 
or external, if they affect all players equally (e.g. an earthquake). This suggests 
that $\eta_i$ is composed of two components, $\eta_i = \tilde{\eta}_i + \eta_0$, where the $\tilde{\eta}_i$'s are 
independent gaussian forces. This motivates our choice 

$$ \Delta_{i,j} = D_i\, \delta_{i,j} + D_0 \qquad (3) $$

for the correlation of $\eta_i$. Here $D_i$ is the strength of the stochastic force $\tilde{\eta}_i$ 
acting on player i, whereas $D_0$ is that for events $\eta_0$ which affect all players 
in the same way. 
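
A minimal simulation sketch of eqs. (1)-(3), assuming a simple Euler-Maruyama time discretization (the discretization, the function names and the parameter values are assumptions of this example, not part of the original treatment). The common force $\eta_0$ is drawn once per time step and added to every player, while the individual forces are drawn independently, which reproduces the correlation $2(D_i \delta_{i,j} + D_0)$. The n-firm Cournot utility $u_i = x_i(1 - \bar{x})$ is used as a test case.

```python
import math
import random


def simulate_langevin(grad_u, x0, gamma, D, D0, dt, steps, seed=0):
    """Euler-Maruyama integration of dx_i/dt = gamma_i * du_i/dx_i + eta_i.

    grad_u(x) returns the list [du_i/dx_i] at the strategy profile x.
    The noise satisfies <eta_i eta_j> = 2 (D_i delta_ij + D0) delta(t - t'):
    an independent part of strength D_i plus a part shared by all players.
    """
    rng = random.Random(seed)
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        g = grad_u(x)
        # one draw per step for the force common to all players
        common = math.sqrt(2.0 * D0 * dt) * rng.gauss(0.0, 1.0)
        x = [
            x[i]
            + gamma[i] * g[i] * dt
            + math.sqrt(2.0 * D[i] * dt) * rng.gauss(0.0, 1.0)
            + common
            for i in range(n)
        ]
    return x


def cournot_gradient(x):
    """du_i/dx_i for u_i = x_i (1 - xbar), the n-firm Cournot game."""
    n = len(x)
    xbar = sum(x) / n
    return [1.0 - xbar - xi / n for xi in x]


if __name__ == "__main__":
    n = 5
    final = simulate_langevin(cournot_gradient, [0.5] * n, [1.0] * n,
                              [0.01] * n, 0.0, dt=0.01, steps=20000)
    print(final, n / (n + 1.0))
```

With all noise strengths set to zero the trajectory relaxes to the Nash equilibrium $n/(n+1)$; with noise it fluctuates around it.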

1 The joint distribution of the strategies factorizes into the single players' 
distribution functions $\mu_i(x,t)$. 

Clearly, since NE are defined as the set of $x_i$ for which the equations 

$$ \frac{\partial}{\partial x_i} u_i(x_i, x_{-i}) = 0 \qquad (4) $$

are simultaneously satisfied, for $\Delta_{i,j} = 0$, NE are stationary points of the 
dynamics. Equation (1) is also very appealing since, by comparison with 
model A dynamics [18], it allows for an analogy with statistical mechanics. 
The main difference with statistical mechanics is that in game theory 
each degree of freedom (player's strategy) maximizes a different function 
(player's utility) whereas in statistical mechanics each degree of freedom 
minimizes the same function (energy or Hamiltonian). Also, at odds with 
statistical mechanics, the interactions among strategies need not be symmetric: 
players' goals may be in conflict with one another. 

In this analogy, NE are analogs of zero temperature (meta)stable states. 
The Langevin approach includes "thermal" fluctuations in game theory, and 
this allows one to analyze the stability of NE through the study of fluctuations. 
It also gives a "free energy" measure, which makes it possible to distinguish 
the real "ground state" from "meta-stable" states. Indeed in an ideal slow cooling 
down where $\Delta_{i,j} \to 0$, analogous to simulated annealing, only the state with 
the smallest "free energy" is selected independently of the initial conditions. 
This contrasts with the evolutionary approach, where the final state is 
uniquely determined by the initial conditions. The Fokker-Planck equation 
associated with eq. (1) provides a description of the game at the level of the 
distribution of $x_i$. At odds with replicator dynamics (which also involves 
the distribution functions of the $x_i$), this does not assume that the $x_i$ are 
independent. As we shall see, correlations indeed arise. 

4. The model 

Many complex systems in economics have a very peculiar form of interaction 
(see figure 1). In a stock market, for example, each agent guesses whether to 
buy or to sell a stock, looking at the stock's price fluctuations. These 
fluctuations are in their turn produced by the cumulative effect of the actions 
of all the agents in the market, through some form of demand-supply law [5]. 
Each player interacts with a signal, which in its turn is determined by the 
collective effect of all players. A further example is the above mentioned 
Cournot model, where n firms produce the same good, and the market 
clearing price is determined by the ratio between the aggregate production 
and the demand. 

Focusing on this kind of interaction, we consider in the following 
situations where the payoff function for player i is 

$$ u_i(x_i, x_{-i}) = -B(x_i, \bar{x}), \qquad \bar{x} = \frac{x_1 + \ldots + x_n}{n}. \qquad (5) $$

STOCHASTIC GAME DYNAMICS 






In other words, the payoff for player i depends on $x_j$ for $j \neq i$ only through 
the aggregate quantity $n\bar{x}$. The n-player Cournot game is of this form, with 
$-B(x,y) = x V(y)$, and has been discussed at length in ref. [8]. 

Because of the symmetry of the interaction, the NE are symmetric: 
$x_i^* = x^*$ for all i, and $x^*$ satisfies the equation 

$$ \left. \frac{\partial u_i}{\partial x_i} \right|_{x_j = x^*} = -B_{1,0}(x^*, x^*) - \frac{1}{n} B_{0,1}(x^*, x^*) = 0, \qquad (6) $$

where we defined, for convenience, 

$$ B_{j,k}(x,y) = \frac{\partial^j}{\partial x^j} \frac{\partial^k}{\partial y^k} B(x,y), \qquad B_{0,0}(x,y) = B(x,y). $$

A NE must not only be an extremum of $u_i$, it must also be a maximum 
with respect to $x_i$ at fixed opponents' strategies $x_{-i} = x^*_{-i}$. This requires 

$$ \left. \frac{\partial^2 u_i}{\partial x_i^2} \right|_{x_j = x^*} = -B_{2,0}(x^*,x^*) - \frac{2}{n} B_{1,1}(x^*,x^*) - \frac{1}{n^2} B_{0,2}(x^*,x^*) < 0. \qquad (7) $$



It is worth emphasizing that eqs. (6,7) are necessary but not sufficient for 
$x^*$ to be a NE. Indeed a NE must be globally stable, which means that 
$u_i(x_i, x_{-i} = x^*)$ must have a global maximum at $x_i = x^*$, $\forall i$. 

5. Fluctuations around a Nash equilibrium 

Let $x_i = x^*$ be a NE for our model. Without loss of generality we can set 
$x^* = 0$ by a linear transformation of the variables. We shall also set $B(0,0) = 0$. 

We can then investigate the small gaussian fluctuations around the NE 
resulting from the Langevin dynamics (1). Expanding the deterministic 
part to leading order, we arrive at the equation 

$$ \dot{x}_i = -\Gamma_i \sum_{j=1}^{n} \left( g\, \delta_{i,j} + \frac{h}{n} \right) x_j + \eta_i \qquad (8) $$

where 

$$ g = B_{2,0}(0,0) + \frac{1}{n} B_{1,1}(0,0), \qquad h = B_{1,1}(0,0) + \frac{1}{n} B_{0,2}(0,0). $$

Stability requires that all the eigenvalues $\lambda_k$ of the matrix $G_{i,j} = \Gamma_i (g\,\delta_{i,j} + h/n)$ 
must be positive. These are given by the equation 

$$ \frac{h}{n} \sum_{i=1}^{n} \frac{\Gamma_i}{\lambda - g\,\Gamma_i} = 1. \qquad (9) $$

A graphic inspection of the solutions of this equation shows that if 

$$ g > 0, \qquad h + g > 0, \qquad (10) $$

then all eigenvalues are positive. Note that, in terms of g and h, the local 
stability condition (7) reads g + h/n > 0. For n > 1 this is clearly satisfied 
if the conditions (10) are met. 
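
A small numerical sketch of how g and h could be obtained for a given B and checked against the conditions (10). It estimates the second derivatives $B_{2,0}$, $B_{1,1}$, $B_{0,2}$ by central finite differences at the symmetric NE; the finite-difference scheme, the function names and the Cournot test function $B(x,y) = -x(1-y)$ are assumptions of this example.

```python
def second_derivative(B, j, k, x, y, eps=1e-3):
    """Central finite-difference estimate of B_{j,k}(x, y) for j + k = 2."""
    if (j, k) == (2, 0):
        return (B(x + eps, y) - 2.0 * B(x, y) + B(x - eps, y)) / eps**2
    if (j, k) == (0, 2):
        return (B(x, y + eps) - 2.0 * B(x, y) + B(x, y - eps)) / eps**2
    if (j, k) == (1, 1):
        return (B(x + eps, y + eps) - B(x + eps, y - eps)
                - B(x - eps, y + eps) + B(x - eps, y - eps)) / (4.0 * eps**2)
    raise ValueError("only second derivatives are implemented")


def linear_parameters(B, x_star, n):
    """Return (g, h) of the linearized dynamics around a symmetric NE x_star."""
    b20 = second_derivative(B, 2, 0, x_star, x_star)
    b11 = second_derivative(B, 1, 1, x_star, x_star)
    b02 = second_derivative(B, 0, 2, x_star, x_star)
    return b20 + b11 / n, b11 + b02 / n


def satisfies_conditions_10(g, h):
    """Conditions (10): g > 0 and g + h > 0."""
    return g > 0.0 and g + h > 0.0


def cournot_B(x, y):
    """B(x, y) = -x (1 - y), i.e. u_i = x_i (1 - xbar)."""
    return -x * (1.0 - y)


if __name__ == "__main__":
    n = 10
    g, h = linear_parameters(cournot_B, n / (n + 1.0), n)
    print(g, h, satisfies_conditions_10(g, h))
```

For this test function $B_{2,0} = 0$, so g comes out equal to 1/n: the equilibrium is stable, but only marginally so for large n.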

Equation (8) is a multivariate Ornstein-Uhlenbeck process [19]. Fluctuations 
around the NE are described by the matrix $\sigma_{i,j} = \langle x_i x_j \rangle$ of correlations. 
This is the solution of the set of linear equations $G\sigma + \sigma G^T = 2\Delta$, 
where $G_{i,j} = \Gamma_i (g\,\delta_{i,j} + h/n)$ and $\Delta$ is the matrix of the noise correlation 
given in eq. (3) [19]. Introducing the vector $v_i = \sum_j \sigma_{i,j}$, this matrix 
equation can be reduced to 

$$ g\, \sigma_{i,i} + \frac{h}{n} v_i = \frac{D_i + D_0}{\Gamma_i} \qquad (11) $$

$$ \left[ g + h\, \overline{\left( \frac{\Gamma}{\Gamma_i + \Gamma} \right)} \right] v_i + h\, \overline{\left( \frac{\Gamma_i v}{\Gamma_i + \Gamma} \right)} = \frac{D_i}{\Gamma_i} + 2 n D_0\, \overline{\left( \frac{1}{\Gamma_i + \Gamma} \right)}. \qquad (12) $$

Here we have introduced the notation $\bar{f} = \frac{1}{n} \sum_k f_k$ for averages over the 
ensemble of players. Note that $v_i = n \langle x_i \bar{x} \rangle$ is the correlation between the 
variable $x_i$ and the global variable $\bar{x}$. 

Qualitatively there are two different cases according to whether the 
$\Gamma_i$ are broadly distributed or not. We shall first focus on the second case, 
when the average value of $\Gamma_i$ is much larger than the fluctuations around it: 
$\overline{(\Gamma - \bar{\Gamma})^2} \ll \bar{\Gamma}^2$. With a redefinition of the scale of B, we set, for simplicity, 
$\bar{\Gamma} = 1$. In the limit $n \to \infty$ and $\epsilon = \sqrt{\overline{(\Gamma - \bar{\Gamma})^2}} \ll 1$, the values of $\Gamma_i$ are 
densely distributed in a small interval of size $\epsilon$. In each interval $[\Gamma, \Gamma + d\Gamma)$, 
$d\Gamma \ll \epsilon$, we can define an average value of $D_i$, $\sigma_{i,i}$ and $v_i$, which we denote 
by $D(\Gamma)$, $\sigma(\Gamma)$ and $v(\Gamma)$. This allows for a systematic expansion in powers 
of $\epsilon$. 



For $\epsilon = 0$, one easily finds 

$$ v(1) = \bar{v} = \frac{D + n D_0}{g + h}, \qquad \sigma(1) = \left[ 1 - \frac{h}{n(g+h)} \right] \frac{D}{g} + \frac{D_0}{g+h}. \qquad (13) $$

Note that v(1) > 0, which means that each variable $x_i$ tends to fluctuate in 
phase with $\bar{x}$. We can also compute an ensemble average of the correlation, 
which for $\epsilon = 0$ reads 

$$ C = \frac{1}{n(n-1)} \sum_{i \neq j} \langle x_i x_j \rangle = \frac{\bar{v} - \bar{\sigma}}{n-1} = \frac{D_0}{g+h} - \frac{h D}{n g (g+h)}. \qquad (14) $$

The common stochastic force, as could be expected, gives a positive 
contribution to the correlation, and for $D_0$ large enough, the correlation 
always turns positive. 
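
The $\epsilon = 0$ expressions can be evaluated numerically as a consistency check. The sketch below (function name and parameter values are illustrative assumptions) computes $\sigma = \langle x_i^2 \rangle$, $v = n \langle x_i \bar{x} \rangle$ and the average correlation C for $\Gamma_i = 1$, and verifies that they satisfy eq. (11) and the relation $C = (v - \sigma)/(n-1)$.

```python
def gaussian_fluctuations(g, h, n, D, D0):
    """Stationary second moments for Gamma_i = 1 (the epsilon = 0 case).

    Returns (sigma, v, C) with sigma = <x_i^2>, v = n <x_i xbar> and
    C = <x_i x_j> averaged over pairs i != j.
    """
    v = (D + n * D0) / (g + h)
    sigma = (1.0 - h / (n * (g + h))) * D / g + D0 / (g + h)
    C = D0 / (g + h) - h * D / (n * g * (g + h))
    return sigma, v, C


if __name__ == "__main__":
    # generic case: g and h of order one
    print(gaussian_fluctuations(0.5, 0.3, 7, 1.2, 0.4))
    # competitive case: g = 1/n, h = 1, no common noise
    for n in (10, 100, 1000):
        sigma, v, C = gaussian_fluctuations(1.0 / n, 1.0, n, 1.0, 0.0)
        print(n, sigma, C)
```

In the competitive case (g of order 1/n) the printed variance grows roughly linearly with n and C is negative, in line with the discussion of competitive equilibria.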

With respect to the dynamics in the stationary state, correlation functions 
decay exponentially 

$$ \langle x_i(t + t_0)\, x_j(t_0) \rangle - \sigma_{i,j} \propto e^{-t/\tau}. $$

The correlation time $\tau$ of the leading exponential behavior is given by the 
minimum eigenvalue, $\tau = \max_k (\lambda_k^{-1})$, in eq. (9). 

It is finally possible to compute the average utility 

$$ \langle u \rangle = -\frac{\bar{\sigma}}{2} B_{2,0} - \frac{\bar{v}}{n} B_{1,1} - \frac{\bar{v}}{2n} B_{0,2} \qquad (15) $$

where all derivatives of B are evaluated at (0, 0). Note that in view of our 
choice B(0, 0) = 0 this expression yields the deviation of the average utility 
from a completely rational behavior. As we shall see, it is possible that 
fluctuations increase the average utility. The last term in eq. (15) is the 
average of the utility at the Nash equilibrium in the presence of fluctuations: 

$$ \langle u_i(0, x_{-i}) \rangle = -\frac{\bar{v}}{2n} B_{0,2}. $$

This would be the utility of a player who maintains the NE strategy 
$x_i = 0$ even in the presence of fluctuations. It is interesting to note that it 
is possible that $\langle u \rangle > \langle u_i(0, x_{-i}) \rangle$. Loosely speaking, this means that in a 
game with random deviations from rationality, players who are affected by 
the randomness may receive a higher payoff (on average) than those who 
behave rationally ($x_i = 0$). The condition for this to happen is 

$$ \frac{\bar{\sigma}}{2} B_{2,0} + \frac{\bar{v}}{n} B_{1,1} < 0. \qquad (16) $$

Let us now discuss these findings. As expected, the fluctuations of $x_i$ 
grow with $D_i$ and $D_0$. There are three qualitatively different cases, as shown 
in figure 2: 

a) $B_{2,0}$, $B_{1,1}$ and $B_{0,2}$ are all finite and positive. The point (g, h) 
lies well inside the domain defined by eq. (10). As a result we have 
a normal behavior with small fluctuations. Fluctuations decrease the 
average utility and a rational behavior is generally more rewarding. 

b) $B_{2,0} = 0$, $B_{1,1}$ and $B_{0,2}$ are finite and positive. Then $g \sim 1/n$ is small 
and, from eq. (13), we see that fluctuations are proportional to n. 
This, in view of the explicit factor g in front of $\sigma_{i,i}$ in eq. (11), is a general 
feature which holds also for broadly distributed $\Gamma_i$. The condition 
$B_{2,0} = 0$ obtains for example for utilities of the form $B(x,y) = -x\,b(y)$, 
which describes e.g. the tragedy of the commons problem [11, 8]. We 
shall discuss in more detail this class of models in the next paragraph. 
Generally $B_{2,0} = 0$ is typical of competitive equilibria. Indeed it means 
that each player does not feel the effects of a change in his $x_i$ directly. 
Rather he feels it indirectly through the reaction of other players, or 
better through the effects this change has on the global variable $\bar{x}$. 
Large fluctuations are a result of the fact that the dependence of $\bar{x}$ on 
$x_i$ is weak. Competitive equilibria are also characterized by negatively 
correlated variables $x_i$ for $D_0$ small enough: C < 0. Large fluctuations 
come together with large relaxation times. Indeed eigenvalues are 
proportional to g, so that for $g \ll 1$ all of them are small, yielding large 
relaxation times $\tau \sim n$. Finally the average utility is decreased. 

c) $B_{2,0} + B_{1,1} \approx 0$. Also in this case large fluctuations occur since $g + h \approx 0$. 
The divergence of the terms proportional to $D_0$ suggests that the mode 
with stronger fluctuations is associated with $\bar{x}$. For h + g small the 
smallest eigenvalue is small. This results in a large correlation time 
$\tau = \bar{\Gamma}^{-1}/(g + h) + O(1)$. At odds with case b), only one eigenvalue 
is small in this case, the others being O(1). The (right) eigenvector 
associated with this eigenvalue, $w_i = 1 + O(g + h)$, is nearly constant. 
The slow mode characterized by large fluctuations is then associated 
with $\bar{x}$. Note that also $C \sim (g + h)^{-1}$ is large and positive, which 
means that the $x_i$ fluctuate in phase, thus yielding a large fluctuation of 
their sum. Finally fluctuations may decrease the average utility in this 
case. Furthermore if player i behaves rationally ($x_i = 0$) he receives a 
smaller payoff: $\langle u \rangle > \langle u_i(0, x_{-i}) \rangle$. 

Figure 2. Quadratic approximation of the utility function close to a Nash equilibrium: 
a) $B_{2,0}$, $B_{1,1}$ and $B_{0,2}$ are finite and positive; b) $B_{2,0} = 0$, $B_{1,1} > 0$ and $B_{0,2} > 0$; 
c) $B_{2,0} + B_{1,1} \approx 0$. 

These results can be extended to higher order in $\epsilon$. The idea is to assume 
that the $\Gamma_i$ are distributed around 1 according to a gaussian density with 
standard deviation $\epsilon$. We shall limit our discussion to the first term here. 
Taking the average of eq. (12), multiplied by $\Gamma_i - 1$, over this distribution 
and taking the leading order in $\epsilon$, gives 

v > (1) 2zy W anPp 

{ ' 9 + h 2g + h (g + h)(2g + h) 

9 an 

These equations give interesting information. For example, $\sigma'(1) < 0$ implies 
that the fluctuations experienced by a player are smaller the faster his 
dynamics (i.e. the larger his rate constant $\Gamma$). This is what one naturally 
expects and it occurs when $D'(1) < D(1) + D_0 [1 - gh/((g+h)(2g+h))] + O(1/n)$. 
On the other hand, if $D(\Gamma)$ grows sufficiently fast with $\Gamma$, one has $\sigma'(1) > 0$. 
This suggests that generally the fluctuations of a variable $x_i$ grow with $D_i$ 
and decrease with $\Gamma_i$. The same kind of information can be obtained for 
the correlation C. The case $D_0 = 0$ yields a compact expression: 

$$ \frac{C'}{C} = -1 + \frac{g+h}{2g+h} \frac{D'}{D}. $$

This says that correlations are weaker for faster variables, unless $D(\Gamma)$ 
grows fast enough with $\Gamma$. We shall not discuss the case $D_0 > 0$, which 
leads to less transparent formulas. 

The case of broadly distributed $\Gamma_i$ needs a more detailed knowledge of 
the parameters. However we expect that the results obtained by the small 
disorder expansion above qualitatively describe the system. Note that eq. 
(11) suggests that $\langle x_i^2 \rangle \propto D_i/\Gamma_i$ diverges as $\Gamma_i \to 0$. In a large system 
of players with broadly distributed $\Gamma_i$, the smallest $\Gamma_i$ can be vanishingly 
small as $n \to \infty$. For example, in a system where the $\Gamma_i$ are distributed 
with a density $\rho(\Gamma) \sim \Gamma^{\beta - 1}$ for $\Gamma \ll 1$, one expects that the smallest $\Gamma_i$ 
in the population of players is $\Gamma_{\min} \sim n^{-1/\beta}$. In this case some players can 
have fluctuations growing with n. It is worth stressing, however, that such 
a distribution of $\Gamma_i$ implies a power law distribution of characteristic times 
$1/\Gamma_i$, which might not be realistic. 

5.1. AN EXAMPLE 

Let us illustrate these findings with a simple example. An example with a 
utility of the form $B(x,y) = xy$ has been discussed in detail elsewhere [8]. 
The main point raised by that example is the emergence of large 
fluctuations. Here we shall describe a second example: 

$$ B(x,y|p) = \frac{b}{2} (x - p y)^2. $$

The utility function $u_i(x_i, x_{-i}) = -B(x_i, \bar{x}|p)$ above describes a classical 
game where each player has to name a number $x_i$ with the aim of guessing 
a fraction p of the average $\bar{x}$ of all players' guesses. This clearly has only 
one NE, $x_i^* = 0$, $\forall i$. $B(x,y|p)$ can also be regarded as a local approximation 
of a more complex utility around one of its Nash equilibria. 

With the choice $b = (1 - p/n)^{-1}$, the parameters are g = 1 and h = -p. 
The stability condition requires that p < 1, which is intuitive because if 
players were told to guess a number larger than the mean, everybody 
would tend to overshoot and $x_i \to \infty$. On the contrary, with p < 1 every 
player has to be careful: he must play a number which is smaller than the 
one played by the others. In the extreme case p < 0 he has to try to do 
the opposite of what the majority does. Let us discuss only the results for 
$\epsilon = 0$. It is straightforward to find 

$$ \sigma = \left[ 1 + \frac{p}{n(1-p)} \right] D + \frac{D_0}{1-p}, \qquad C = \frac{p D}{n(1-p)} + \frac{D_0}{1-p}. $$

Note that, for $D_0 = 0$, C has the same sign as p. For p > 0, a player 
attempts to guess the fluctuation of the others and as a result he tends to make 
fluctuations in the same direction as the others. On the other hand, for 
p < 0 a negative correlation arises. 

The average utility is $\langle u \rangle = -\frac{1}{2} D - \frac{n(1-p)}{2(n-p)} D_0$, whereas the condition (16), 
after some algebra, reads: 

$$ [n - (n+1)p]\, D + n(1 - 2p)\, D_0 < 0. $$

In order for this condition to hold at least one of the two terms must be 
negative. For $\frac{1}{2} < p < \frac{n}{n+1}$, it becomes "favorable" to follow the random 
force for 

$$ D_0 > \frac{n - (n+1)p}{n(2p - 1)} D. $$

In other words, if the global component of the stochastic force is strong 
enough, it is not convenient to play the NE strategy $x_i = 0$. 

This behavior can be qualitatively explained as follows: each player has
to try to follow as closely as possible the global variable $p\bar x$. The
latter evolves under a stochastic force $\bar\eta$ of strength
$D/n + D_0 \simeq D_0$. The random force $\eta_i$ acting on each player has a
component of strength $D_0$ along $p\bar x$. If this component is large
enough, each player can guess correctly whether the mean $\bar x$ will move
left or right, and so it becomes favorable to follow the stochastic force.

For $p > \frac{n}{n+1}$ the condition holds $\forall D > 0$ with $D_0 = 0$.
This region is also characterized by large correlated fluctuations (note that
$g + h = 1 - p \sim 1/n$). Even in the absence of a global stochastic force,
the dynamics leads to a state where the $x_i$ are highly correlated. In such
a state, the strategy $x_i = 0$ is less rewarding than the average.

For $p < \frac{1}{2}$ there are no values of $D$ and $D_0$ for which the
condition (16) is satisfied.
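
The three regimes in $p$ can be made concrete with a few lines of code. This is a minimal sketch which assumes that condition (16) takes the form $[n-(n+1)p]D + n(1-2p)D_0 < 0$ for this example; the helper names are hypothetical.

```python
def follow_noise_is_favorable(n, p, D, D0):
    """Assumed form of condition (16): [n - (n+1)p] D + n (1 - 2p) D0 < 0."""
    return (n - (n + 1) * p) * D + n * (1 - 2 * p) * D0 < 0

def critical_D0(n, p, D):
    """Threshold on D0 in the window 1/2 < p < n/(n+1); None outside it."""
    if not (0.5 < p < n / (n + 1)):
        return None
    return (n - (n + 1) * p) / (n * (2 * p - 1)) * D

# Inside the window, the condition switches on at the threshold.
print(critical_D0(10, 0.7, 1.0))
print(follow_noise_is_favorable(10, 0.7, 1.0, 0.5),
      follow_noise_is_favorable(10, 0.7, 1.0, 0.6))
```

For $p < 1/2$ both terms are non-negative, so the check fails for any $D, D_0 \ge 0$, while for $p > n/(n+1)$ it already succeeds with $D_0 = 0$.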



6. Probability distribution in strategy space 

In this section we shall extend the analysis of our model to study the full 
probability distribution in the stationary state. A complete treatment is 
not possible in general. We shall restrict attention to the case 

$$\Gamma_i = 1, \qquad D_i = D.$$

In view of our discussion of the previous section, $\Gamma_i = 1$ means that
all players have the same characteristic time-scale. Qualitatively, we expect
that the results below apply also in the case of narrowly distributed
time-scales. Our equation is

$$\dot x_i = -B_{1,0}(x_i, \bar x) - \frac{1}{n}B_{0,1}(x_i, \bar x) + \eta_i. \tag{17}$$

It is useful to introduce the variables

$$y_k = \frac{1}{\sqrt{k(k+1)}}\sum_{i=1}^{k}(x_i - x_{k+1}), \qquad k < n \tag{18}$$

$$y_n = \frac{x_1 + \ldots + x_n}{\sqrt{n}} = \sqrt{n}\,\bar x. \tag{19}$$

The transformation $x \to y$ is orthonormal, which implies the useful identity

$$\sum_{i=1}^{n} x_i^2 = \sum_{k=1}^{n} y_k^2. \tag{20}$$

The same transformation, applied to the noise term $\eta \to \zeta$, leads to
a stochastic force $\zeta_k$ in the equation for $y_k$ which has a correlation

$$\langle \zeta_j(t)\zeta_k(t') \rangle = 2\delta_{j,k}(D + nD_0\delta_{k,n})\,\delta(t-t') \tag{21}$$

which is diagonal. The common stochastic force $D_0$ acts on $y_n$ only. For
convenience, instead of $y_n$, we shall use the variable
$\bar x = y_n/\sqrt{n}$. The noise $\bar\eta$ appearing in the equation for
$\bar x$ has a strength $T = D/n + D_0$.
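
The properties of this change of variables are easy to verify numerically. The following sketch builds the matrix of eqs. (18) and (19) and checks orthonormality, the identity (20) and the diagonal noise correlation (21). The function name is hypothetical, and the noise covariance $D\delta_{ij} + D_0$ in the original variables is an assumption consistent with eq. (21).

```python
import numpy as np

def transformation_matrix(n):
    """Rows 1..n-1 implement eq. (18); the last row implements eq. (19)."""
    O = np.zeros((n, n))
    for k in range(1, n):
        norm = np.sqrt(k * (k + 1))
        O[k - 1, :k] = 1.0 / norm       # coefficients of x_1 .. x_k
        O[k - 1, k] = -k / norm         # coefficient of x_{k+1}
    O[n - 1, :] = 1.0 / np.sqrt(n)      # y_n = sqrt(n) * xbar
    return O

n, D, D0 = 6, 0.3, 0.1
O = transformation_matrix(n)
noise_x = D * np.eye(n) + D0 * np.ones((n, n))   # assumed covariance of eta
noise_y = O @ noise_x @ O.T                      # covariance of zeta
print(np.round(np.diag(noise_y), 6))             # D, ..., D, D + n*D0
```

Only the last mode, $y_n$, feels the common force, which is why $\bar x$ can be treated separately from the other $n-1$ variables.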



6.1. LINEAR UTILITY 

Let us first consider the model

$$B(x, z) = x\,b(z),$$

which allows for a full solution for the stationary state distribution of
$x_i$. A situation described by this kind of utility function [11, 8] is a
system where $n$ firms produce a quantity $x_i$ of a homogeneous product.
Then $-b$ is the difference between the market clearing price of one unit of
product and the production cost of one unit. We assume that it depends only
on the aggregate production $\sum_i x_i = n\bar x$ (the production cost per
unit is a constant). The payoff $u_i = -x_i b$ of firm $i$ is then
proportional to its production. In realistic situations $b(\bar x)$ is an
increasing function. One expects e.g. that, because of competition, the price
$-b$ of a product decreases with the total quantity produced, in view of the
law of supply and demand.

The NE $x_i = x^*$ is defined by $b(x^*) = -x^* b'(x^*)/n$. Note that the
payoff per player

$$u_i = -x^* b(x^*) = \frac{x^{*2}\,b'(x^*)}{n}$$

is positive and proportional to $1/n$.

The orthonormal transformation $x \to y$ yields

$$\dot y_k = -\frac{b'(\bar x)}{n}\,y_k + \zeta_k, \qquad k < n \tag{22}$$

$$\dot{\bar x} = -b(\bar x) - \frac{b'(\bar x)}{n}\,\bar x + \bar\eta. \tag{23}$$

The equation for $\bar x$ does not involve other variables and can be directly
solved, yielding the distribution $P_n(\bar x)$. The equations for $y_k$
depend only on $\bar x$. Treating $\bar x$ as a parameter, one can find the
conditional distribution of $y_k$: $p(y_k|\bar x)$. The full distribution is
then

$$P(y_1, \ldots, y_n) = P_n(\bar x)\prod_{k=1}^{n-1} p(y_k|\bar x).$$

Back transforming to the variables $x_i$ yields the solution. Eq. (23)
describes a "particle" in a potential with thermal fluctuations and can be
solved using standard techniques [19]:

$$P_n(\bar x) = \mathcal{N}\exp\left\{-\frac{\bar x\,b(\bar x) + (n-1)\int_0^{\bar x} dz\,b(z)}{D + nD_0}\right\} \tag{24}$$

with $\mathcal{N}$ a normalization factor. The equation for $y_k$ similarly
gives

$$p(y_k|\bar x) = \sqrt{\frac{b'(\bar x)}{2\pi nD}}\,\exp\left\{-\frac{b'(\bar x)\,y_k^2}{2nD}\right\}$$

where here normalization requires $b'(\bar x) > 0$. Using the equation (20),
one can easily find the full distribution of $x_i$:



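$$p(x_1, \ldots, x_n) \propto \left[\frac{b'(\bar x)}{2\pi nD}\right]^{\frac{n-1}{2}} \exp\left\{-\frac{b'(\bar x)}{2nD}\sum_{i=1}^{n}(x_i - \bar x)^2 - \frac{\bar x\,b(\bar x) + (n-1)\int_0^{\bar x} dz\,b(z)}{D + nD_0}\right\}$$

where $\bar x = (x_1 + \cdots + x_n)/n$.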
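
As an illustration of eq. (24), the distribution of $\bar x$ can be evaluated on a grid. The sketch below assumes the form $P_n(\bar x) \propto \exp\{-[\bar x b(\bar x) + (n-1)\int_0^{\bar x} b]/(D + nD_0)\}$ and uses the illustrative choice $b(z) = z - 1$, for which the exponent is quadratic; the helper names and parameter values are hypothetical.

```python
import numpy as np

def log_Pn(xbar, n, D, D0, b, b_integral):
    """Unnormalized logarithm of eq. (24)."""
    return -(xbar * b(xbar) + (n - 1) * b_integral(xbar)) / (D + n * D0)

def mode_and_variance(n, D, D0, b, b_integral, grid):
    logp = log_Pn(grid, n, D, D0, b, b_integral)
    w = np.exp(logp - logp.max())       # subtract the max to avoid overflow
    w /= w.sum()
    mean = np.sum(grid * w)
    var = np.sum((grid - mean) ** 2 * w)
    return grid[np.argmax(logp)], var

# Illustrative choice b(z) = z - 1; its integral from 0 to x is x^2/2 - x.
b = lambda z: z - 1.0
b_integral = lambda x: 0.5 * x**2 - x

n, D, D0 = 20, 0.05, 0.0
grid = np.linspace(0.0, 2.0, 20001)
x_mode, var = mode_and_variance(n, D, D0, b, b_integral, grid)
print(x_mode, n / (n + 1))              # mode vs. Nash equilibrium
print(var, (D + n * D0) / (n + 1))      # variance vs. Gaussian prediction
```

For this linear $b$ the mode coincides with the NE $x^* = n/(n+1)$ and the variance of $\bar x$ is $(D + nD_0)/(n+1)$, which is what the grid estimate reproduces.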

Some implications of this result have already been discussed in ref. [8].
In particular it was observed that if $b'(\bar x) \sim O(1)$ one finds
fluctuations of order $\langle x^2 \rangle \propto nD$ and large relaxation
times $\tau \sim n$. We note furthermore that $p(x_1, \ldots, x_n)$ vanishes
as $b'(\bar x) \to 0^+$. For $b'(\bar x) < 0$, which corresponds in any case
to an "unphysical" situation², we must set $p(x_1, \ldots, x_n) = 0$. Note
that eqs. (22, 23) imply that if the system is "prepared" at $t = 0$ in a
state with $b'(\bar x) < 0$, in the early stages of the dynamics the
deviations $x_i - \bar x$ increase exponentially. This is clearly a highly
non-equilibrium situation.

The average utility, to order $D + nD_0$, is given by

$$\langle u_i \rangle \simeq -x^* b(x^*) - \frac{D + nD_0}{2[(n+1)b'(x^*) + x^* b''(x^*)]}\left.\frac{d^2}{dx^2}[x\,b(x)]\right|_{x=x^*}. \tag{26}$$

If $\frac{d^2}{dx^2}[x\,b(x)]_{x=x^*} < 0$, then the effect of fluctuations
will be that of increasing the average utility (note that, in the notation of
the previous section, $(n+1)b'(x^*) + x^* b''(x^*) = n(g+h) > 0$). An
example which allows for simple expressions is
$b(x) = -1 + x - \frac{a}{2}x^2$. Since $b'(x) = 1 - ax$, we need to restrict
the range of $x$ to $x < 1/a$ in order for $b'(x) > 0$ (otherwise $\bar x$
would flow to $\infty$). The NE is at

$$x^* = \frac{n+1}{a(n+2)}\left(1 - \sqrt{1 - 2a + \frac{2a}{(n+1)^2}}\right) \simeq \frac{1}{a}\left(1 - \sqrt{1-2a}\right)$$

and its existence requires $a < 1/2$. The payoff per player at the NE, to
leading order in $n$, is
$-x^* b(x^*) \simeq \frac{1}{na^2}(1 - \sqrt{1-2a})^2\sqrt{1-2a}$. With
gaussian fluctuations, we find:



$$\langle u_i \rangle \simeq -x^* b(x^*)\left[1 - \frac{a^2(3\sqrt{1-2a} - 1)(D + nD_0)}{2(1-2a)(1 - \sqrt{1-2a})^2}\right].$$



² For example, in the firms problem, $b'(\bar x) < 0$ means that the price
increases if the production is increased.

For $a > 4/9$ fluctuations increase the average utility, an effect related to
the asymmetry of the function $b(x)$ around $x^*$.
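
The threshold $a = 4/9$ can be checked directly. The sketch below assumes the example $b(x) = -1 + x - \frac{a}{2}x^2$, the NE condition $b(x^*) = -x^* b'(x^*)/n$, and the large-$n$ relative correction $-a^2(3\sqrt{1-2a}-1)(D+nD_0)/[2(1-2a)(1-\sqrt{1-2a})^2]$; the function names are hypothetical.

```python
import math

def nash_equilibrium(a, n):
    """Smaller root of b(x) = -x b'(x)/n for b(x) = -1 + x - a x^2 / 2."""
    disc = 1 - 2 * a + 2 * a / (n + 1) ** 2
    if disc < 0:
        raise ValueError("no Nash equilibrium for this value of a")
    return (n + 1) / (a * (n + 2)) * (1 - math.sqrt(disc))

def relative_utility_correction(a, noise):
    """Large-n estimate of <u_i>/u* - 1, with noise = D + n*D0 (assumed form)."""
    s = math.sqrt(1 - 2 * a)
    return -a**2 * (3 * s - 1) * noise / (2 * (1 - 2 * a) * (1 - s) ** 2)

for a in (0.40, 4 / 9, 0.45):
    print(a, nash_equilibrium(a, 50), relative_utility_correction(a, 1e-3))
```

The correction is negative below $a = 4/9$, vanishes there (where $\sqrt{1-2a} = 1/3$), and becomes positive above it.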

The average $\langle \bar x \rangle$, instead, has no corrections to order
$D$. As we shall see, this does not hold in general. Nor does it hold when
the function $b(x)$ changes rapidly close to the NE, a situation which cannot
be described in the gaussian approximation. Consider for example the game of
the tragedy of the commons, where the utility function is much steeper when
negative payoffs arise: $b(x) = x - 1$ for $x < 1$ and $b(x) = q(x-1)$ for
$x > 1$, with $q \gg 1$. This represents a situation where each player is
very scared of receiving negative payoffs. Clearly, as far as the NE is
concerned, no change occurs: the NE $x^* = n/(n+1)$ is always the same,
dangerously close to the edge of negative utilities. In the presence of
fluctuations, however, the distribution of $\bar x$ is very asymmetric on the
two sides of the NE. For $\bar x > x^*$ it drops off much more quickly than
for $\bar x < x^*$. As a result the average $\langle \bar x \rangle$ is
shifted by an amount of order $\sqrt{D/n}$ towards safer values
$\bar x < x^*$. We see then that fluctuations can induce a more cautious
behavior.
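
The cautious shift can be seen by integrating the distribution of $\bar x$ numerically. This is a rough sketch which assumes that eq. (24), $P_n(\bar x) \propto \exp\{-[\bar x b(\bar x) + (n-1)\int_0^{\bar x} b]/(D + nD_0)\}$, can be used with the piecewise linear $b$ of this example; the function name and parameter values are illustrative.

```python
import numpy as np

def mean_xbar(n, D, q, D0=0.0):
    """<xbar> for b(x) = x - 1 (x < 1) and b(x) = q (x - 1) (x > 1)."""
    grid = np.linspace(-1.0, 3.0, 400001)
    b = np.where(grid < 1.0, grid - 1.0, q * (grid - 1.0))
    # Integral of b from 0 to x: x^2/2 - x below the kink, and
    # -1/2 + q (x - 1)^2 / 2 above it (b is continuous at x = 1).
    b_int = np.where(grid < 1.0, 0.5 * grid**2 - grid,
                     -0.5 + 0.5 * q * (grid - 1.0) ** 2)
    logp = -(grid * b + (n - 1) * b_int) / (D + n * D0)
    w = np.exp(logp - logp.max())       # unnormalized weights
    return float(np.sum(grid * w) / np.sum(w))

n, D = 20, 0.5
print(n / (n + 1), mean_xbar(n, D, q=1.0), mean_xbar(n, D, q=50.0))
```

With $q = 1$ the distribution is symmetric and the mean sits at $x^*$; with $q \gg 1$ the weight above $x = 1$ is suppressed and the mean moves below $x^*$.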

The most general model which allows for a complete solution with the
above technique is $B(x,z) = B_0(z) + x\,b(z) + cx^2$. The term $B_0(z)$
changes only the distribution of $\bar x$, by a factor proportional to
$\exp[-B_0(\bar x)/n]$, whereas the term $cx^2$ also affects the distribution
of $x_i$. A simple realization of this model, with $c = 0$, is the one where
players cooperate to increase each other's utility:
$B_0(z) = \beta z\,b(z)$ ($\beta > 0$). Of course $\beta < 0$ means
anti-cooperation, i.e. each player tries to maximize his utility as well as
to minimize that of others. With $b(z) = z - 1$, one easily finds that the NE
is at $x^* = \frac{n+\beta}{n+2\beta+1}$ and the payoff per player is
$u^* = \frac{(\beta+1)^2(n+\beta)}{(n+2\beta+1)^2}$. A high degree of
cooperation, $\beta \propto n$, leads to a finite utility per player. On the
contrary, fluctuations always remain large,
$\langle \delta x_i^2 \rangle \sim nD$. For $\beta \to \infty$, as discussed
in [8], fluctuations diverge even though the average utility remains finite.
Clearly anti-cooperation, $\beta < 0$, decreases the utility. However, for
$\beta < -1$ fluctuations give a positive contribution to the utility.
Fluctuations increase the average utility in over-competitive systems (those
in which each player's main efforts are devoted to decreasing other players'
payoffs).



6.2. THE GENERAL CASE 

The general case does not allow for a full solution. It is however possible to
compute the stationary state distribution to leading order in $D$. We assume
that

$$y_k \sim x_i - \bar x \sim O(\sqrt{D}). \tag{27}$$

While this is surely satisfied close to a NE (when all $x_i$ are close to
$x^*$), it might not hold in non-equilibrium situations or when rare events
such as large fluctuations occur.






The equation for $y_k$, derived from eq. (17), contains terms
$B_{j,l}(x_i, \bar x) - B_{j,l}(x_{k+1}, \bar x)$ with $(j,l) = (1,0)$ or
$(0,1)$. Expanding in powers of $x_i - \bar x$ and $x_{k+1} - \bar x$, we
find, to leading order

$$\dot y_k = -\left[B_{2,0}(\bar x, \bar x) + \frac{1}{n}B_{1,1}(\bar x, \bar x)\right] y_k + \zeta_k$$

where $\zeta_k$ is still gaussian uncorrelated noise, in view of the
orthonormality of the transformation $x \to y$ (see eq. 21). This equation is
valid to $O(D)$, since we neglected terms proportional to
$(x_i - \bar x)^2$, which are of order $D$ in view of eq. (27). Using
eq. (21), one easily finds:



$$\langle y_k^2 | \bar x \rangle = \frac{nD}{nB_{2,0}(\bar x, \bar x) + B_{1,1}(\bar x, \bar x)} \tag{28}$$

where we used the notation $\langle y | x \rangle$ for the average of the
quantity $y$ conditional on the value $x$ assumed by a second variable. Note
that the requirement $\langle y_k^2 | \bar x \rangle > 0$ implies
$nB_{2,0}(\bar x, \bar x) + B_{1,1}(\bar x, \bar x) > 0$. This condition,
which generalizes the condition $b'(\bar x) > 0$ of the previous paragraph,
restricts the range of possible values of $\bar x$.

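Let us now move to the equation for $\bar x$. Expanding the right hand side
of the equation for $x_i$ to second order in $x_i - \bar x$, we find

$$\dot{\bar x} = -B_{1,0}(\bar x, \bar x) - \frac{1}{n}B_{0,1}(\bar x, \bar x) - \frac{1}{2}\left[B_{3,0}(\bar x, \bar x) + \frac{1}{n}B_{2,1}(\bar x, \bar x)\right]\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar x)^2 + \bar\eta$$

where $\langle \bar\eta(t)\bar\eta(t') \rangle = 2(D/n + D_0)\delta(t - t')$.
In view of eq. (20), we have
$\sum_{i=1}^{n}(x_i - \bar x)^2 = \sum_{k=1}^{n-1} y_k^2$. Taking the average
over $y_k$ conditional on the value of $\bar x$ in this equation, we
substitute $\frac{1}{n}\sum_i (x_i - \bar x)^2$ with
$\frac{n-1}{n}\langle y_k^2 | \bar x \rangle$. This results in the equation

$$\dot{\bar x} = -B_{1,0} - \frac{1}{n}B_{0,1} - \frac{(n-1)D}{2n}\,\frac{nB_{3,0} + B_{2,1}}{nB_{2,0} + B_{1,1}} + \bar\eta$$

where we suppressed the dependence on $(\bar x, \bar x)$ of $B_{i,j}$.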

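The steady state distribution of $\bar x$ is then given by
$P(\bar x) \propto \exp[-F(\bar x)/T]$, where the "temperature" is
$T = D/n + D_0$, and the "free energy" is given by:

$$F(x) = \int^{x} dy \left[B_{1,0} + \frac{1}{n}B_{0,1} + \frac{(n-1)D}{2n}\,\frac{nB_{3,0} + B_{2,1}}{nB_{2,0} + B_{1,1}}\right] \tag{29}$$

with all $B_{i,j}$ evaluated at $(y,y)$.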






It is useful, for the following discussion, to split $F = U - DS$ into a
$D$ independent part and a $D$ dependent one:

$$U(x) = \int^{x} dy \left[B_{1,0}(y,y) + \frac{1}{n}B_{0,1}(y,y)\right]$$

and

$$S(x) = -\frac{n-1}{2n}\int^{x} dy\,\frac{nB_{3,0}(y,y) + B_{2,1}(y,y)}{nB_{2,0}(y,y) + B_{1,1}(y,y)}.$$

The relation $F = U - DS$ is reminiscent of a free energy in equilibrium
statistical mechanics, which is a useful paradigm to discuss our system.
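
The decomposition $F = U - DS$ lends itself to a direct numerical evaluation. The sketch below is an assumption-laden illustration: it takes $U'(x) = B_{1,0} + B_{0,1}/n$ and $S'(x) = -\frac{n-1}{2n}(nB_{3,0} + B_{2,1})/(nB_{2,0} + B_{1,1})$, all evaluated at $(x,x)$, and integrates them with the trapezoidal rule. The function `free_energy` and its callable arguments are hypothetical names; the demonstration uses the linear utility $B(x,z) = x(z-1)$, for which the entropic part vanishes.

```python
import numpy as np

def free_energy(x, n, D, B10, B01, B20, B11, B30, B21):
    """Return U, S and F = U - D*S on the grid x (integrals start at x[0]).

    Each B.. argument is a callable y -> B_{j,k}(y, y).
    """
    u_prime = B10(x) + B01(x) / n
    denom = n * B20(x) + B11(x)
    if np.any(denom <= 0):
        raise ValueError("n*B20 + B11 must be positive on the whole grid")
    s_prime = -(n - 1) / (2 * n) * (n * B30(x) + B21(x)) / denom

    def cumulative_trapezoid(f):
        steps = 0.5 * (f[1:] + f[:-1]) * np.diff(x)
        return np.concatenate(([0.0], np.cumsum(steps)))

    U = cumulative_trapezoid(u_prime)
    S = cumulative_trapezoid(s_prime)
    return U, S, U - D * S

# Demonstration with B(x, z) = x (z - 1): B10 = z - 1, B01 = x, B11 = 1,
# and B20 = B30 = B21 = 0.
zero = lambda y: np.zeros_like(y)
one = lambda y: np.ones_like(y)
n, D = 10, 0.1
x = np.linspace(0.0, 2.0, 20001)
U, S, F = free_energy(x, n, D, lambda y: y - 1.0, lambda y: y, zero, one, zero, zero)
print(x[np.argmin(F)], n / (n + 1))
```

In this linear case $S \equiv 0$ and the minimum of $F$ sits at the NE $n/(n+1)$; models with non-zero $B_{3,0}$ or $B_{2,1}$ can be explored by passing different callables.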

6.3. DISCUSSION AND APPLICATIONS 

It is worth stressing that the function $U(x)$ is not simply related to the
utility. This contrasts with equilibrium statistical mechanics, where the
probability of a state is directly related to its energy.

The "entropic" term $S(x)$ results from the inclusion of the fluctuations
of the variables $y_k$. Usually, in statistical mechanics, one finds that the
larger the fluctuations in a state, the larger the entropy is. We shall see in
the following that this does not hold for our system: $S$ can be larger for
"ordered" states than for states with wild fluctuations.

Fluctuations displace the NE (defined as the minima of $F(x)$) by a
quantity of order $D$:

$$x^*(D) = x^*(0) - \frac{(n-1)(nB_{3,0} + B_{2,1})}{2(nB_{2,0} + B_{1,1})[nB_{2,0} + (n+1)B_{1,1} + B_{0,2}]}\,D.$$

Note that stability requires that the denominator be positive.

The analysis of any particular case goes as follows. First one needs to
determine the range of $\bar x$ where our approach can be applied. This is
given by the condition
$nB_{2,0}(\bar x, \bar x) + B_{1,1}(\bar x, \bar x) > 0$, which ensures that
$\langle y_k^2 | \bar x \rangle > 0$ is finite. Outside this range, the
behavior cannot be described perturbatively in $D$ (i.e. the gaussian
approximation for the variables $y_k$ is no longer valid). Secondly, one
finds all solutions of eq. (6), $nB_{1,0} + B_{0,1} = 0$, which are the
candidates for NE's. Each solution to this equation must then be checked
for stability. If eqs. (10) are verified, the equilibrium is stable. Finally,
for each stable equilibrium one can evaluate its free energy $F(x^*)$ from
eq. (29). This gives the statistical weight of each state in the stationary
regime. The state with the smallest $F(x^*)$ is the one with the largest
statistical weight and it dominates in the limit $T = D/n + D_0 \to 0$.

In ref. [8] we discussed the case

$$B(x,y) = -x(1-y)\left[1 - \frac{x(1-y)}{2c}\right], \qquad c > 0 \tag{30}$$

of a quadratic utility both in $x$ and $y$. This is an interesting case, both
because such a utility function can be motivated [8] and because the system
possesses two NE's. As a rough motivation, we can go back to the firms
problem with $b(y) = -1 + y$, and argue that their utility $u_i$ may not
exactly equal their net gain $x_i(1 - \bar x)$, since this is then subject to
taxes. If taxes do not grow linearly with the income (as is usually the case)
we need to add a further term to the utility. The simplest choice leads to
the above form of $B(x,y)$.

Figure 3. $F(x)$, as a function of $D$ for the quadratic utility function:
$n = 100$ and $c = 0.1$.

Let us go through the above steps for this example. Eq. (6) has only one
solution for $c > 1/4$, which is at

$$x_0 = \frac{n}{n+1}, \tag{31}$$

which is stable $\forall c > 0$. For $c < 1/4$ two other solutions appear at

$$x_\pm = \frac{1 \pm \sqrt{1-4c}}{2} \tag{32}$$

of which only $x_-$ is stable. These two NE's have a quite different nature.
The NE at $x_0$ is a competitive NE, since $B_{2,0} \ll 1$, and it is
characterized by a small payoff per player $u_i \sim 1/n$ and by large
fluctuations $\langle \delta x_i^2 \rangle \sim nD$. The NE at $x_-$ is
Pareto dominant³, since it has a finite and positive utility. Also
fluctuations are finite, as $n \to \infty$, at $x_-$. At this NE the action of
players is limited by the increase of the non-linear term due to "taxes".
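
The equilibria of this example can be enumerated in a few lines. The sketch below assumes that, for $B = -w + w^2/(2c)$ with $w = x(1-y)$, the symmetric drift is $U'(x) = [(1-x) - x/n]\,[x(1-x)/c - 1]$, and it tests stability only in the restricted sense of a sign change of $U'$ along the symmetric direction, not through the full conditions (10). The function names are hypothetical.

```python
import math

def u_prime(x, n, c):
    """dU/dx = B10(x,x) + B01(x,x)/n for B = -w + w^2/(2c), w = x(1-y)."""
    return ((1 - x) - x / n) * (x * (1 - x) / c - 1)

def equilibria(n, c, eps=1e-6):
    """Map each candidate NE to True if U has a local minimum there."""
    xs = [n / (n + 1)]
    if c < 0.25:
        s = math.sqrt(1 - 4 * c)
        xs += [(1 - s) / 2, (1 + s) / 2]
    # A minimum of U means U' goes from negative to positive across the root.
    return {x: u_prime(x - eps, n, c) < 0 < u_prime(x + eps, n, c)
            for x in sorted(xs)}

for x, stable in equilibria(100, 0.1).items():
    print(round(x, 4), "stable" if stable else "unstable")
```

For $n = 100$ and $c = 0.1$ this lists $x_-$, $x_+$ and $x_0$ in increasing order, with only the middle one unstable; for $c > 1/4$ only $x_0$ survives.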



³ The NE with the largest utility is called the Pareto dominant equilibrium
in game theory.

The function $F(x)$ is readily computed. Setting, for convenience,
$F(x_+) = 0$, we find

$$F(x_-) = -\frac{(1-4c)^{3/2}}{6c} + \frac{(1-4c)^{3/2}}{6c}\,\frac{1}{n} - \log\left[\frac{1 - 2c - \sqrt{1-4c}}{1 - 2c + \sqrt{1-4c}}\right]\frac{D}{n} + O(n^{-2})$$

and

$$F(x_0) = \frac{1 - 6c + 6c^2 - (1-4c)^{3/2}}{12c} - \frac{1 - 6c - 6c^2 - (1-4c)^{3/2}}{12c}\,\frac{1}{n} - \log\left[n\,\frac{1 - 2c - \sqrt{1-4c}}{2c}\right]\frac{D}{n} + O(n^{-2}).$$



The function $F(x)$ is plotted in figure 3 for $c = 0.1$ as a function of $D$.
The "entropic" contribution $S(x)$ is of order $1/n$. This is a consequence
of the fact that $B_{3,0} = 0$ in this model. Since
$B_{2,1}(y,y) = -2(1-y)/c < 0$, fluctuations in $y_k$ increase the
probability of the state $x_0$. Indeed, for large enough $D$, figure 3 shows
that the state at $x_0$ has a higher probability. Therefore fluctuations in
this case decrease the probability of the Pareto dominant NE. Note also that,
as $D \to 0$ and $n \to \infty$, $F(x_0) < F(x_-)$ for
$c > 2/9 = 0.2222\ldots$, which implies that the probability to find the
system in the Pareto dominant NE $x_-$ tends to 0. This example shows that
the system does not always choose the state of maximum utility. In addition,
when $D > 0$ one stable state can become unstable. In our example, for
higher values of $D$ or $c$ the state at $x_-$, which is a minimum of $U(x)$,
is no longer a minimum of $F(x)$.
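
The value $c = 2/9$ can be recovered numerically in the limit $D \to 0$, where only the $D$ independent part $U$ of the free energy matters. The sketch below assumes $U'(x) = [(1-x) - x/n]\,[x(1-x)/c - 1]$ for the quadratic utility, whose antiderivative is $U(x) = G(1-x) + G(x)/n$ with $G(t) = -t^3/(3c) + t^4/(4c) + t^2/2$. The overall normalization of $U$ does not affect the sign of the difference; all names are hypothetical.

```python
import math

def U(x, n, c):
    """Antiderivative of U'(x) = [(1-x) - x/n] [x(1-x)/c - 1]."""
    G = lambda t: -t**3 / (3 * c) + t**4 / (4 * c) + t**2 / 2
    return G(1 - x) + G(x) / n

def delta_U(n, c):
    """U(x_0) - U(x_-); negative means x_0 has the larger weight as D -> 0."""
    x0 = n / (n + 1)
    xm = (1 - math.sqrt(1 - 4 * c)) / 2
    return U(x0, n, c) - U(xm, n, c)

def bisect_root(f, lo, hi, tol=1e-10):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == (f_lo > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

c_star = bisect_root(lambda c: delta_U(10**6, c), 0.20, 0.24)
print(c_star, 2 / 9)
```

For large $n$ the crossing point approaches $2/9$: below it the Pareto dominant state $x_-$ has the lower $U$, above it the competitive state $x_0$ does.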

In this system, however, entropic effects are "accidentally" small because
$B_{3,0} = 0$. If one adds a higher order term the situation changes. Consider
indeed

$$B(x,y) = -x(1-y)\left[1 - \frac{1 - bc^3}{2c}\,x(1-y) - \frac{b}{4}\,x^3(1-y)^3\right], \qquad c > 0. \tag{33}$$



For $0 < b < 4/c^3$ this has the same stable equilibria as before⁴. The
picture, in the limit $n \to \infty$, is qualitatively the same apart from the
entropy, which now is finite since $B_{3,0}(x,x) = 6bx(1-x)^4 \ne 0$. Note
that $B_{2,1} < 0$ and $B_{3,0} > 0$. Therefore the effects of fluctuations
are opposite to the ones discussed above. Indeed $S(x_0) < S(x_-)$, which
means that the stochastic force favours, in this case, the Pareto dominant NE
$x_-$ over the state $x_0$. Contrary to our intuition from statistical
mechanics, it is the "ordered" state $x_-$ which has a larger contribution
from the stochastic term. This is shown in figure 4, which shows in
particular that fluctuations lead to a new minimum of $F(x)$: this
meta-stable state is a precursor of the NE at $x_-$. This example shows that
the identification of $S(x)$ with an entropy can be misleading.

⁴ For $b < 0$ the system is unstable and for $b > 4/c^3$, $x_-$ becomes
unstable and two new equilibria appear.

Figure 4. $F(x)$, as a function of $D$ for the utility function (33):
$n = 100$, $c = 0.25$ and $b = 0.1$.

Direct numerical simulations of the Langevin equation give results in
good agreement with this picture for small values of $D$. The higher order
terms in the $D$ expansion seem to enhance the behavior discussed above
for the two particular models.
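
A direct simulation of this kind can be sketched with a simple Euler-Maruyama scheme. The code below is not the authors' simulation: it is a minimal illustration which assumes the Langevin equation $\dot x_i = -B_{1,0}(x_i,\bar x) - B_{0,1}(x_i,\bar x)/n + \eta_i$ with an individual noise of strength $D$ and a common noise of strength $D_0$, applied to the linear utility $B(x,z) = x(z-1)$. The function name, time step and run length are arbitrary choices.

```python
import numpy as np

def simulate_mean_xbar(n=10, D=0.01, D0=0.0, dt=0.01, steps=20000, seed=0):
    """Time average of xbar from an Euler-Maruyama integration."""
    rng = np.random.default_rng(seed)
    x = np.full(n, n / (n + 1.0))       # start at the Nash equilibrium
    xbar_sum = 0.0
    for _ in range(steps):
        xbar = x.mean()
        # Drift -B10 - B01/n for B = x (z - 1): B10 = xbar - 1, B01 = x_i.
        drift = -(xbar - 1.0) - x / n
        noise = np.sqrt(2 * D * dt) * rng.standard_normal(n)    # individual
        noise += np.sqrt(2 * D0 * dt) * rng.standard_normal()   # common
        x = x + drift * dt + noise
        xbar_sum += x.mean()
    return xbar_sum / steps

print(simulate_mean_xbar(), 10 / 11)
```

For this linear model the time average of $\bar x$ stays close to the NE $n/(n+1)$; replacing the drift with that of a bistable utility would allow one to look at transitions between states, at a much higher computational cost.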

For a particle in a random potential $F(x)$ subject to a stochastic force
of strength $T = D/n + D_0$, the transition times from one metastable state
to the other are of the order of
$\tau \sim \exp\{n[F(x_+) - F(x_i)]/(D + nD_0)\}$, where $i = 0, -$ labels
the state of departure. The generalization of this result to our case
predicts relaxation times that, for $D_0 = 0$, diverge in the
"thermodynamic" limit $n \to \infty$. This behavior is reminiscent of a first
order phase transition in statistical mechanics. It is worth recalling,
however, that the transition from one state to the other is a far from
equilibrium process, whereas we derived eq. (29) within an approximation
which is valid to order $D$ close to the equilibria (see eq. 27). For this
reason we performed numerical simulations of the above bi-stable systems.
Even though a systematic quantitative computation of transition times was too
demanding, we definitely found that numerical simulations are in qualitative
agreement with the picture emerging from the $O(D)$ approximation.

7. Conclusions 

We have introduced a simple stochastic dynamics for game theory. This 
assumes "local" rationality since any player tries to climb the gradient of 
his utility function. This deterministic process is affected by a stochastic 
force which represents deviation from rationality in the form of a "heat
bath". We focused on particular games with a global interaction which
is typical of socio-economic systems: each player's utility depends on his
strategy as well as on a global quantity. The stable states of this dynamics
coincide with the NE. We studied the gaussian fluctuations around these 
NE and established that competitive equilibria are characterized by large 
fluctuations which grow with the number of players. Large fluctuations 
imply great inequalities in the distribution of utility among players. Uneven 
distribution of goods is, unfortunately, very common in the economic world. 
Our analysis suggests that this is related to the competitive nature of the 
Nash equilibrium. Players competing for a common resource have broadly 
distributed utilities whereas players whose strategy is bounded by a direct 
utility loss (as in the tragedy of commons with taxes) have more or less the 
same payoffs. Fluctuations usually decrease the utility of players, but cases 
where the contrary holds are also possible. Finally we studied the general 
case in a small noise limit. We found that, depending on the particular
game, fluctuations can either increase or decrease the dominance of a Pareto
dominant state and that new metastable states can occur.

This approach can naturally be extended to games with a discrete 
strategy space. For these, the Langevin dynamics can be replaced by e.g. 
Metropolis dynamics where each player tries to minimize his own cost func- 
tion. 

Fluctuations and deviation from rationality are inevitable in the real
world. Reducing their strength $D_i$ costs time and money. This suggests a
generalization of our work where $D_i$ is considered as a parameter in the
strategy space. If one then assumes that players have "local" rationality, so
that the best thing they can do is to climb their utility gradient, one is
left with a system where the strategy of each player consists in the choice of
the strength of the fluctuations $D_i$ and the rate $\Gamma_i$ at which they
climb the potential. In terms of these parameters $(\vec D, \vec\Gamma)$ we
can define a generalized utility function

$$U_i(\vec D, \vec\Gamma) = \langle u_i(x_i, x_{-i}) \rangle_{\vec D, \vec\Gamma} + U_0(D_i, \Gamma_i), \tag{34}$$

where the first term is the average utility discussed in the body of the
paper. The second term is instead related to the cost of achieving a noise
reduction to strength $D_i$ and a rate of convergence $\Gamma_i$. In general
we expect $U_0$ to be a decreasing function of $D_i$. Furthermore, infinite
precision $D_i = 0$ most likely requires an infinite cost. A possible
candidate for $-U_0$ is the entropy:
$U_0(D) \propto \langle \log P(\eta) \rangle = -\log\sqrt{D} + C$. In general
we found that the average utility decreases with increasing $D$. In these
cases Nash equilibria in the strategy space $(\vec D, \vec\Gamma)$ will occur
for $D_i > 0$. In the particular cases where we found that
$\langle u_i(x_i, x_{-i}) \rangle$ increases with $D_i$, a large noise
strength will be preferred to a more rational behavior. This new approach
would provide a more realistic description of real systems of interacting
players.